NSF PAR Search | NSF Public Access Repository

Demonstrating CEDAR: A System for Cost-Efficient Data-Driven Claim Verification

https://doi.org/10.1145/3722212.3725098

Jayasekara, Tharushi; Trummer, Immanuel (June 2025, ACM)

Free, publicly-accessible full text available June 22, 2026
Demonstrating SQLBarber: Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads

https://doi.org/10.1145/3722212.3725101

Lao, Jiale; Trummer, Immanuel (June 2025, ACM)

Free, publicly-accessible full text available June 22, 2026
SwellDB: Dynamic Query-Driven Table Generation with Large Language Models

https://doi.org/10.1145/3722212.3725136

Giannakouris, Victor; Trummer, Immanuel (June 2025, ACM)

Free, publicly-accessible full text available June 22, 2026
SpareLLM: Automatically Selecting Task-Specific Minimum-Cost Large Language Models under Equivalence Constraint

https://doi.org/10.1145/3725356

Jo, Saehan; Trummer, Immanuel (June 2025, Proceedings of the ACM on Management of Data)

We introduce SpareLLM (Selecting Passable And Resource-Efficient LLMs), a novel LLM framework designed to minimize the inference costs (i.e., resource-efficient) of large-scale NLP tasks while ensuring sufficient result quality (i.e., passable). It enables users to specify an equivalence constraint in terms of the equivalence of outputs to those of the most powerful LLM. SpareLLM then generates results that deviate from the outputs of this LLM only with a probability below a user-defined threshold. SpareLLM employs a profiling phase that evaluates the performance of multiple LLMs to identify those that meet the user-defined equivalence level. It optimizes the tradeoff between profiling overheads and the anticipated cost savings resulting from profiling. Moreover, SpareLLM further reduces inference costs by strategically leveraging a mix of LLMs. Our experiments on five real-world datasets show that SpareLLM achieves significant cost savings, up to 8.6x, while generating equivalent outputs in 90% of cases compared to GPT-4-Turbo. Compared to recent LLM cascading baselines, SpareLLM demonstrates a superior tradeoff between cost and accuracy, accounting for 91.1% and 83.8% of the points on the Pareto curve for OpenAI and Llama models, respectively.
Free, publicly-accessible full text available June 17, 2026
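
The profiling idea in the SpareLLM abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a fixed profiling sample, uses a one-sided Hoeffding bound as a stand-in for whatever statistical test the authors use, and ignores the tradeoff between profiling overhead and savings. The names `Candidate`, `agreement_lower_bound`, and `pick_cheapest` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str             # hypothetical model label
    cost_per_call: float  # hypothetical cost units
    outputs: list         # this model's outputs on the profiling sample


def agreement_lower_bound(outputs, reference, delta=0.05):
    """Lower confidence bound on the rate of agreement with the reference model.

    Assumption: a one-sided Hoeffding bound, which holds with
    probability at least 1 - delta for i.i.d. samples.
    """
    n = len(reference)
    matches = sum(o == r for o, r in zip(outputs, reference))
    return matches / n - math.sqrt(math.log(1 / delta) / (2 * n))


def pick_cheapest(candidates, reference, min_agreement, delta=0.05):
    """Return the cheapest candidate whose bound meets the constraint, else None."""
    passing = [
        c for c in candidates
        if agreement_lower_bound(c.outputs, reference, delta) >= min_agreement
    ]
    return min(passing, key=lambda c: c.cost_per_call) if passing else None
```

Returning `None` when no candidate passes corresponds to falling back to the most powerful model, which satisfies the equivalence constraint trivially.
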
Generating highly customizable python code for data processing with large language models

https://doi.org/10.1007/s00778-025-00900-4

Trummer, Immanuel (March 2025, The VLDB Journal)

Free, publicly-accessible full text available March 1, 2026
Customizing Operator Implementations for SQL Processing via Large Language Models

Trummer, Immanuel (March 2025, A Quarterly bulletin of the Computer Society of the IEEE Technical Committee on Data Engineering)

Free, publicly-accessible full text available March 1, 2026
λ-Tune: Harnessing Large Language Models for Automated Database System Tuning

https://doi.org/10.1145/3709652

Giannakouris, Victor; Trummer, Immanuel (February 2025, Proceedings of the ACM on Management of Data)

We introduce λ-Tune, a framework that leverages Large Language Models (LLMs) for automated database system tuning. The design of λ-Tune is motivated by the capabilities of the latest generation of LLMs. Unlike prior work, which leverages LLMs to extract tuning hints for single parameters, λ-Tune generates entire configuration scripts based on a large input document describing the tuning context. λ-Tune generates alternative configurations and uses a principled approach to identify the best configuration out of a small set of candidates. In doing so, it minimizes reconfiguration overheads and ensures that evaluation costs are bounded as a function of the optimal run time. By treating prompt generation as a cost-based optimization problem, λ-Tune conveys the most relevant context to the LLM while bounding the number of input tokens and, therefore, monetary fees for LLM invocations. We compare λ-Tune to various baselines, using multiple benchmarks and PostgreSQL and MySQL as target systems for tuning, showing that λ-Tune is significantly more robust than prior approaches.
Free, publicly-accessible full text available February 10, 2026
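
The λ-Tune abstract describes prompt generation as a cost-based optimization problem under a bound on input tokens. One way to picture this is as a 0/1 knapsack over context snippets, sketched below. The knapsack formulation, the function name `select_context`, and the per-snippet token counts and relevance values are assumptions made for illustration; the abstract does not state the paper's actual formulation.

```python
def select_context(snippets, token_budget):
    """Choose context snippets that maximize total value within a token budget.

    snippets: list of (text, tokens, value) tuples, where tokens is a
    positive integer and value is a hypothetical relevance score.
    Returns the texts of the chosen snippets in input order.
    """
    # best[b] holds (total value, chosen indices) using at most b tokens.
    best = [(0.0, [])] * (token_budget + 1)
    for i, (_, tokens, value) in enumerate(snippets):
        # Iterate budgets downward so each snippet is used at most once.
        for b in range(token_budget, tokens - 1, -1):
            candidate = best[b - tokens][0] + value
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - tokens][1] + [i])
    return [snippets[i][0] for i in best[token_budget][1]]
```

With a budget of 6 tokens, two 3-token snippets worth 7 and 6 are preferred over a single 4-token snippet worth 10, because their combined value is higher.
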
Generating Succinct Descriptions of Database Schemata for Cost-Efficient Prompting of Large Language Models

https://doi.org/10.14778/3681954.3682017

Trummer, Immanuel (July 2024, Proceedings of the VLDB Endowment)

Using large language models (LLMs) for tasks like text-to-SQL translation often requires describing the database schema as part of the model input. LLM providers typically charge as a function of the number of tokens read. Hence, reducing the length of the schema description saves money at each model invocation. This paper introduces Schemonic, a system that automatically finds concise text descriptions of relational database schemata. By introducing abbreviations or grouping schema elements with similar properties, Schemonic typically finds descriptions that use significantly fewer tokens than naive schema representations. Internally, Schemonic models schema compression as a combinatorial optimization problem and uses integer linear programming solvers to find guaranteed optimal or near-optimal solutions. It speeds up optimization by starting optimization from heuristic solutions and reducing the search space size via pre-processing. The experiments on TPC-H, SPIDER, and Public-BI demonstrate that Schemonic reduces schema description length significantly, along with fees for reading them, without reducing the accuracy in tasks such as text-to-SQL translation.
Full Text Available
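
The Schemonic abstract mentions grouping schema elements with similar properties to shorten descriptions. The sketch below shows that idea on a toy schema by stating each column type once per table. It is only an illustration: Schemonic itself is described as using integer linear programming, whereas this is a single fixed grouping rule, and `rough_token_count` is a crude regex-based proxy rather than a real LLM tokenizer. All function names are hypothetical.

```python
import re
from collections import defaultdict


def naive_description(schema):
    """Describe each column with its own type annotation."""
    parts = []
    for table, cols in schema.items():
        col_text = ", ".join(f"{name} {ctype}" for name, ctype in cols)
        parts.append(f"{table}({col_text})")
    return "; ".join(parts)


def grouped_description(schema):
    """State each type once per table, followed by the columns of that type."""
    parts = []
    for table, cols in schema.items():
        by_type = defaultdict(list)
        for name, ctype in cols:
            by_type[ctype].append(name)
        groups = [f"{ctype}: {' '.join(names)}" for ctype, names in by_type.items()]
        parts.append(f"{table}({'; '.join(groups)})")
    return " | ".join(parts)


def rough_token_count(text):
    """Count words and punctuation marks as a stand-in for a tokenizer."""
    return len(re.findall(r"\w+|[^\w\s]", text))
```

The saving grows with the number of columns that share a type, since each repeated type annotation and separator is replaced by a single group header.
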
Demonstrating λ-Tune: Exploiting Large Language Models for Workload-Adaptive Database System Tuning

https://doi.org/10.1145/3626246.3654751

Giannakouris, Victor; Trummer, Immanuel (June 2024, ACM)

Full Text Available
ThalamusDB: Approximate Query Processing on Multi-Modal Data

https://doi.org/10.1145/3654989

Jo, Saehan; Trummer, Immanuel (May 2024, Proceedings of the ACM on Management of Data)

We introduce ThalamusDB, a novel approximate query processing system that processes complex SQL queries on multi-modal data. ThalamusDB supports SQL queries integrating natural language predicates on visual, audio, and text data. To answer such queries, ThalamusDB exploits a collection of zero-shot models in combination with relational processing. ThalamusDB utilizes deterministic approximate query processing, harnessing the relative efficiency of relational processing to mitigate the computational demands of machine learning inference. For evaluating a natural language predicate, ThalamusDB requests a small number of labels from users. Users can specify their preferences on the performance objective regarding the three relevant metrics: approximation error, computation time, and labeling overheads. The ThalamusDB query optimizer chooses optimized plans according to user preferences, prioritizing data processing and requested labels to maximize impact. Experiments with several real-world data sets, taken from Craigslist, YouTube, and Netflix, show that ThalamusDB achieves an average speedup of 35.0x over MindsDB, an exact processing baseline, and outperforms ABAE, a sampling-based method, in 78.9% of cases.
Full Text Available